Below are the datasets I plan to use for my story.
https://data-msdis.opendata.arcgis.com/datasets/mo-public-school-districts/explore

https://data-msdis.opendata.arcgis.com/datasets/mo-2020-public-schools/explore

We often hear about the urban-rural divide, so I want to highlight differences between Missouri's more populated areas and its more rural ones, specifically by comparing rural and urban school districts.
I'd like to examine student/teacher ratios and school density by land area.

Target Audience

My target audience is Missouri residents who are interested in education and in the differences between rural and urban areas. I hope this project provides insight into how educational facilities and resources differ between rural and urban districts. Readers should come away with a better understanding of the challenges districts may face with class size and student transportation.

Loading packages and data

from pathlib import Path
import urllib.request
import shutil
import geopandas as gpd
import pandas as pd
import json
import folium
from folium.plugins import MarkerCluster
from branca.colormap import linear
pd.set_option('display.max_columns', None)
# District layer on the ArcGIS FeatureServer; the ?f=pjson response is the layer's JSON description
file_url = 'https://services2.arcgis.com/kNS2ppBA4rwAQQZy/ArcGIS/rest/services/MO_Public_School_Districts/FeatureServer/0?f=pjson'

local_file_name = 'MO_Public_School_Districts.json'

file_path = Path('../exercises/')
file_path /= local_file_name

# Stream the response to disk in binary mode
with urllib.request.urlopen(file_url) as response, file_path.open(mode = 'wb') as out_file:
    shutil.copyfileobj(response, out_file)
# Read the district polygons and school points from shapefiles in the working directory
mo_districts = gpd.read_file('MO_Public_School_Districts.shp', layer = 0)
mo_schools = gpd.read_file('MO_2020_Public_Schools.shp', layer = 0)
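The `?f=pjson` request above describes the layer rather than returning its features, so the geometry has to come from elsewhere (here, shapefiles on disk). A minimal sketch of building a feature query URL instead, assuming this service supports the standard ArcGIS `query` operation with GeoJSON output, which has not been verified here. `build_query_url` is a hypothetical helper, not part of any library.

```python
from urllib.parse import urlencode

# Layer URL taken from the notebook, without the ?f=pjson suffix
LAYER_URL = (
    "https://services2.arcgis.com/kNS2ppBA4rwAQQZy/ArcGIS/rest/services/"
    "MO_Public_School_Districts/FeatureServer/0"
)

def build_query_url(layer_url, where="1=1", out_fields="*"):
    """Hypothetical helper: build a query URL that asks for all features as GeoJSON."""
    # Assumption: the service accepts f=geojson on its query endpoint
    params = {"where": where, "outFields": out_fields, "f": "geojson"}
    return f"{layer_url}/query?{urlencode(params)}"

query_url = build_query_url(LAYER_URL)
print(query_url)
```

If the service does respond with GeoJSON, a URL like this could probably be passed straight to `gpd.read_file`. Services often cap the number of records per request, so a large layer may need paging.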

Exploratory Analysis

mo_districts.head()
FID STATEFP ELSDLEA GEOID NAME DIST_NAME DIST_CODE CODIST COUNTY LSAD10 LOGRADE HIGRADE MTFCC SDTYP FUNCSTAT ALAND AWATER INTPTLAT INTPTLON Area_SqMil Shape__Are Shape__Len geometry
0 1 29 None 2920490 Maryville R-II School District Maryville R-II 074201 074-201 Nodaway 00 PK 12 G5420 None E 324365817 612177 +40.3426688 -094.8788717 125.453015 5.600208e+08 119623.574639 POLYGON ((-94.92889 40.43306, -94.92791 40.433...
1 2 29 None 2903480 Atlanta C-3 School District Atlanta C-3 061150 061-150 Macon 00 KG 12 G5420 None E 418773072 7996637 +39.8997878 -092.5182534 164.652172 7.260594e+08 204808.140357 POLYGON ((-92.73476 40.00487, -92.73175 40.004...
2 3 29 None 2918540 Liberty 53 School District Liberty 53 024090 024-090 Clay 00 PK 12 G5420 None E 212701169 1418667 +39.2551406 -094.3973310 82.635664 3.575781e+08 122716.191316 POLYGON ((-94.49204 39.30984, -94.49204 39.310...
3 4 29 None 2928430 South Callaway Co. R-II School District South Callaway Co. R-II 014130 014-130 Callaway 00 PK 12 G5420 None E 498480796 9610382 +38.7552288 -091.8221408 196.068682 8.365335e+08 190280.448659 POLYGON ((-91.64232 38.84370, -91.64235 38.843...
4 5 29 None 2931440 Waynesville R-VI School District Waynesville R-VI 085046 085-046 Pulaski 00 PK 12 G5420 None E 485531102 4910959 +37.7655497 -092.1553666 189.235301 7.860890e+08 187724.141675 POLYGON ((-92.28531 37.90125, -92.28518 37.901...
mo_schools.head()
FID CtyDist SchNum SchID Facility Address Address2 City State ZIP County Phone FAX BGrade EGrade Principal PrinTitle Teachers Enrollment Email Latitude Longitude Loc_Code geometry
0 1 010093 3040 010093-3040 Smithton Middle 3600 W Worley None Columbia MO 652034679 Boone 5732143260 5732143261 06 08 Mr. Chris Drury Principal 70 751 cdrury@cpsk12.org 38.959750 -92.388944 MAP_MU POINT (-92.38895 38.95976)
1 2 115115 4470 115115-4470 Dewey School-Internat'L. Studies 815 Ann Avenue None St. Louis MO 631044134 St. Louis City 3146454845 3146455926 PK 05 Mr. Andrew Donovan Principal 31 433 ANDREW.DONOVAN@SLPS.ORG 38.630979 -90.302469 MAP_MU POINT (-90.30240 38.63107)
2 3 115115 4480 115115-4480 Dunbar and Br. 1415 N Garrison Avenue None St. Louis MO 631061506 St. Louis City 3145332526 3145330269 PK 06 Mr. Anthony Virdure Principal 16 157 ANTHONY.VIRDURE@SLPS.ORG 38.645176 -90.220644 MAP_MU POINT (-90.22058 38.64526)
3 4 115906 1945 115906-1945 Grand Center Arts Academy High 711 N. Grand Avenue None St. Louis MO 631031029 St. Louis City 3145331791 None 09 12 Ms. Ashley Olson Head of School 51 390 ashley.olson@confluenceacademies.o* 38.640595 -90.230978 MAP POINT (-90.23080 38.64067)
4 5 115916 6980 115916-6980 Gateway Science Acad/St. Louis 6576 Smiley Avenue None St. Louis MO 631392425 St. Louis City 3149327513 3149327514 K 05 Mr. Nuh Celik Principal 34 424 ncelik@gsastl.org 38.606788 -90.302452 MAP POINT (-90.30239 38.60687)
mo_schools.describe()
FID Teachers Enrollment Latitude Longitude
count 2392.000000 2392.000000 2392.000000 2392.000000 2392.000000
mean 1196.500000 36.155936 379.226589 38.406679 -92.391508
std 690.655244 25.474640 336.203921 0.982586 1.718383
min 1.000000 0.000000 0.000000 36.042349 -95.517960
25% 598.750000 20.000000 157.000000 37.606746 -94.142487
50% 1196.500000 32.000000 319.500000 38.638314 -92.544424
75% 1794.250000 44.000000 485.250000 39.033227 -90.534994
max 2392.000000 292.000000 2408.000000 40.551358 -89.336075
mo_districts.describe()
FID ALAND AWATER Area_SqMil Shape__Are Shape__Len
count 556.000000 5.560000e+02 5.560000e+02 556.000000 5.560000e+02 556.000000
mean 278.500000 3.533803e+08 4.756741e+06 125.329984 5.294306e+08 135595.656717
std 160.647648 2.434082e+08 9.478520e+06 97.422157 4.131485e+08 69290.410992
min 1.000000 5.175745e+06 0.000000e+00 0.003682 1.580583e+04 535.917846
25% 139.750000 1.802593e+08 5.919828e+05 59.426830 2.525088e+08 93420.363093
50% 278.500000 2.957942e+08 1.708151e+06 104.898316 4.427092e+08 130482.861336
75% 417.250000 4.629258e+08 4.955277e+06 166.577938 7.033748e+08 176140.129775
max 556.000000 1.308389e+09 1.117709e+08 507.046949 2.274117e+09 371187.544151

Data Carpentry

totalTeachers = mo_schools.groupby('CtyDist')['Teachers'].sum().to_frame().reset_index()
totalTeachers.columns = ['DIST_CODE', 'Teachers']
totalTeachers.head()
DIST_CODE Teachers
0 001090 31
1 001091 251
2 001092 31
3 002089 65
4 002090 21
totalStudents = mo_schools.groupby('CtyDist')['Enrollment'].sum().to_frame().reset_index()
totalStudents.columns = ['DIST_CODE', 'Students']
totalStudents.head()
DIST_CODE Students
0 001090 246
1 001091 2549
2 001092 148
3 002089 365
4 002090 187
SchoolsPerDist = mo_schools.groupby('CtyDist').size().to_frame().reset_index()
print(SchoolsPerDist.shape)
SchoolsPerDist.head()
(558, 2)
CtyDist 0
0 001090 2
1 001091 5
2 001092 2
3 002089 3
4 002090 1
SchoolsPerDist.columns = ['DIST_CODE', 'SchoolsPerDist']
SchoolsPerDist.head()
DIST_CODE SchoolsPerDist
0 001090 2
1 001091 5
2 001092 2
3 002089 3
4 002090 1
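The three `groupby` calls above could also be collapsed into one pass with pandas named aggregation. A minimal sketch on a toy stand-in for `mo_schools`; the per-school numbers are made up for illustration, and only the column names come from the notebook.

```python
import pandas as pd

# Toy stand-in for mo_schools; individual school values are invented
schools = pd.DataFrame({
    "CtyDist": ["001090", "001090", "001091"],
    "Teachers": [10, 21, 251],
    "Enrollment": [100, 146, 2549],
})

# One groupby producing school count, total teachers, and total students per district
per_district = (
    schools.groupby("CtyDist")
    .agg(
        SchoolsPerDist=("Teachers", "size"),   # rows per group = schools per district
        Teachers=("Teachers", "sum"),
        Students=("Enrollment", "sum"),
    )
    .reset_index()
    .rename(columns={"CtyDist": "DIST_CODE"})
)
print(per_district)
```

A single table like this would need only one merge onto the district polygons instead of three.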
print(SchoolsPerDist.shape)
SchoolsPerDist.dtypes
(558, 2)
DIST_CODE         object
SchoolsPerDist     int64
dtype: object
print(mo_districts.shape)
mo_districts.dtypes
(556, 23)
FID              int64
STATEFP         object
ELSDLEA         object
GEOID           object
NAME            object
DIST_NAME       object
DIST_CODE       object
CODIST          object
COUNTY          object
LSAD10          object
LOGRADE         object
HIGRADE         object
MTFCC           object
SDTYP           object
FUNCSTAT        object
ALAND            int64
AWATER           int64
INTPTLAT        object
INTPTLON        object
Area_SqMil     float64
Shape__Are     float64
Shape__Len     float64
geometry      geometry
dtype: object
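The shapes above differ: the schools table yields 558 district codes, while the district layer has 556 rows. A left merge onto the districts would silently drop any code that has no polygon. One way to see which codes fail to match is an outer merge with `indicator=True`. This is a sketch on toy frames with made-up codes, not a result from the real data.

```python
import pandas as pd

# Toy stand-ins; the codes and counts are invented for illustration
districts_toy = pd.DataFrame({"DIST_CODE": ["001090", "001091", "999999"]})
schools_toy = pd.DataFrame({
    "DIST_CODE": ["001090", "001091", "888888"],
    "SchoolsPerDist": [2, 5, 1],
})

# The _merge column reports whether each key was found in the left, right, or both frames
check = districts_toy.merge(schools_toy, on="DIST_CODE", how="outer", indicator=True)
unmatched = check[check["_merge"] != "both"]
print(unmatched[["DIST_CODE", "_merge"]])
```

Rows marked `left_only` would be districts with no schools attached, and `right_only` rows would be school codes with no district polygon.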
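# Join the per-district totals onto the district polygons by district code
districts = pd.merge(mo_districts, SchoolsPerDist, on='DIST_CODE', how='left')

districts = pd.merge(districts, totalTeachers, on='DIST_CODE', how='left')

districts = pd.merge(districts, totalStudents, on='DIST_CODE', how='left')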
districts.head()
FID STATEFP ELSDLEA GEOID NAME DIST_NAME DIST_CODE CODIST COUNTY LSAD10 LOGRADE HIGRADE MTFCC SDTYP FUNCSTAT ALAND AWATER INTPTLAT INTPTLON Area_SqMil Shape__Are Shape__Len geometry SchoolsPerDist Teachers Students
0 1 29 None 2920490 Maryville R-II School District Maryville R-II 074201 074-201 Nodaway 00 PK 12 G5420 None E 324365817 612177 +40.3426688 -094.8788717 125.453015 5.600208e+08 119623.574639 POLYGON ((-94.92889 40.43306, -94.92791 40.433... 5 166 1465
1 2 29 None 2903480 Atlanta C-3 School District Atlanta C-3 061150 061-150 Macon 00 KG 12 G5420 None E 418773072 7996637 +39.8997878 -092.5182534 164.652172 7.260594e+08 204808.140357 POLYGON ((-92.73476 40.00487, -92.73175 40.004... 2 34 210
2 3 29 None 2918540 Liberty 53 School District Liberty 53 024090 024-090 Clay 00 PK 12 G5420 None E 212701169 1418667 +39.2551406 -094.3973310 82.635664 3.575781e+08 122716.191316 POLYGON ((-94.49204 39.30984, -94.49204 39.310... 20 1015 12815
3 4 29 None 2928430 South Callaway Co. R-II School District South Callaway Co. R-II 014130 014-130 Callaway 00 PK 12 G5420 None E 498480796 9610382 +38.7552288 -091.8221408 196.068682 8.365335e+08 190280.448659 POLYGON ((-91.64232 38.84370, -91.64235 38.843... 4 90 781
4 5 29 None 2931440 Waynesville R-VI School District Waynesville R-VI 085046 085-046 Pulaski 00 PK 12 G5420 None E 485531102 4910959 +37.7655497 -092.1553666 189.235301 7.860890e+08 187724.141675 POLYGON ((-92.28531 37.90125, -92.28518 37.901... 10 478 6163
districts['StudentsPerTeacherDistrictAVG'] = districts.Students/districts.Teachers
districts['SqMilesPerSchool'] = districts.Area_SqMil/districts.SchoolsPerDist
districts.sort_values('SqMilesPerSchool', ascending = False).head()
FID STATEFP ELSDLEA GEOID NAME DIST_NAME DIST_CODE CODIST COUNTY LSAD10 LOGRADE HIGRADE MTFCC SDTYP FUNCSTAT ALAND AWATER INTPTLAT INTPTLON Area_SqMil Shape__Are Shape__Len geometry SchoolsPerDist Teachers Students StudentsPerTeacherDistrictAVG SqMilesPerSchool
31 32 29 None 2911280 Knox Co. R-I School District Knox Co. R-I 052096 052-096 Knox 00 PK 12 G5420 None E 1286030139 7261263 +40.1284564 -092.1560353 499.008667 2.214785e+09 226287.750081 POLYGON ((-92.12443 40.30356, -92.12328 40.303... 2 52 451 8.673077 249.504334
451 452 29 None 2903060 Alton R-IV School District Alton R-IV 075087 075-087 Oregon 00 PK 12 G5420 None E 1274431039 3173474 +36.7420566 -091.3960160 493.142998 1.993678e+09 254588.937796 POLYGON ((-91.65821 36.87360, -91.65847 36.875... 2 69 681 9.869565 246.571499
8 9 29 None 2920700 Scotland Co. R-I School District Scotland Co. R-I 099082 099-082 Scotland 00 PK 12 G5420 None E 983501690 5981916 +40.4641152 -092.1660064 434.965998 1.948965e+09 206693.578301 POLYGON ((-91.94312 40.60583, -91.94316 40.601... 2 73 568 7.780822 217.482999
456 457 29 None 2918460 Lewis Co. C-1 School District Lewis Co. C-1 056017 056-017 Lewis 00 PK 12 G5420 None E 1059711015 8690561 +40.0678744 -091.7538065 412.298632 1.826208e+09 283022.552414 POLYGON ((-91.95068 40.26203, -91.94900 40.262... 2 90 879 9.766667 206.149316
195 196 29 None 2929810 Summersville R-II School District Summersville R-II 107153 107-153 Texas 00 PK 12 G5420 None E 921850199 306241 +37.2133105 -091.6620205 355.886415 1.456554e+09 241855.613900 POLYGON ((-91.64663 37.42274, -91.64651 37.422... 2 43 436 10.139535 177.943208
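The `describe()` output for `mo_schools` shows a minimum of 0 for both `Teachers` and `Enrollment`. If a whole district sums to zero teachers, the division above would produce `inf` or `NaN`, which could distort a sort or a color scale. Whether any district actually totals zero is not shown in the outputs above, so the following is only a defensive sketch on toy values.

```python
import numpy as np
import pandas as pd

# Toy totals; the first row mirrors Knox Co. R-I from the table above, the rest are invented
toy = pd.DataFrame({"Students": [451, 120, 0], "Teachers": [52, 0, 0]})

# Treat zero teachers as missing so the ratio becomes NaN instead of inf
teachers = toy["Teachers"].replace(0, np.nan)
toy["StudentsPerTeacher"] = toy["Students"] / teachers
print(toy)
```

Rows with a missing ratio can then be inspected or dropped explicitly before mapping.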

Data Visualizations

districtsMap = folium.Map([38.318364, -92.412253], tiles='CartoDB Positron', zoom_start=6.5)

# generate choropleth map 
choropleth = folium.Choropleth(
    geo_data=districts,
    data=districts,
    columns=['NAME', 'SchoolsPerDist'],
    key_on='feature.properties.NAME',
    fill_color='Reds', 
    fill_opacity=1, 
    line_opacity=1,
    legend_name='Schools per School District',
    highlight=True,
    smooth_factor=0).add_to(districtsMap)
# CSS applied to the tooltip text (a style string, not a function)
tooltip_style = "font-size: 15px; font-weight: bold"
choropleth.geojson.add_child(
    folium.features.GeoJsonTooltip(['NAME','SchoolsPerDist'], style=tooltip_style, labels=False))

# create a layer control
folium.LayerControl().add_to(districtsMap)
<folium.map.LayerControl at 0x124f3b490>
districtsMap
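School counts per district are likely skewed, with many small districts and a few very large ones, so equal-width color bins may leave most of the map in the lightest shade. A minimal sketch of computing quantile-based breaks with pandas, using made-up counts. The resulting list is intended for the `bins` argument of `folium.Choropleth`, which expects increasing values that span the data range.

```python
import pandas as pd

# Toy school counts per district; the values are invented for illustration
counts = pd.Series([1, 1, 2, 2, 3, 5, 10, 20, 80])

# Quartile breaks from min to max; de-duplicate in case quantiles coincide
quantiles = counts.quantile([0, 0.25, 0.5, 0.75, 1.0])
bins = sorted(set(float(q) for q in quantiles))
print(bins)
```

With the real data, `counts` would be `districts['SchoolsPerDist'].dropna()`, and the list could be passed as `bins=bins` in the `folium.Choropleth` call.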